Classifier Combination for Contextual Idiom Detection Without Labelled Data
Abstract
We propose a novel unsupervised approach for distinguishing literal and non-literal use of idiomatic expressions. Our model combines an unsupervised and a supervised classifier. The former bases its decision on the cohesive structure of the context and labels training data for the latter, which can then take a larger feature space into account. We show that a combination of both classifiers leads to significant improvements over using the unsupervised classifier alone.
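The two-stage idea can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration, not the authors' method: the toy cohesion score (the share of context words related to a component word of the idiom), the rule that high cohesion suggests literal use, the thresholds, the hand-made `related` pairs, the Naive Bayes second stage, and the fallback rule used to combine the two classifiers. The abstract only states that a cohesion-based unsupervised classifier labels training data for a supervised one.

```python
import math
from collections import Counter

def cohesion_score(idiom_words, context_words, related):
    """Toy cohesion (assumption): share of context words related to an idiom word."""
    if not context_words:
        return 0.0
    hits = sum(1 for w in context_words
               if any((i, w) in related for i in idiom_words))
    return hits / len(context_words)

def unsupervised_label(score, hi=0.3, lo=0.0):
    """Label only confident cases; thresholds are hypothetical."""
    if score >= hi:
        return "literal"
    if score <= lo:
        return "nonliteral"
    return None  # undecided, left for the supervised classifier

class NaiveBayes:
    """Bag-of-words Naive Bayes with add-one smoothing (stand-in second stage)."""
    def fit(self, docs, labels):
        self.labels = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {y: Counter() for y in self.labels}
        for words, y in zip(docs, labels):
            self.counts[y].update(words)
        self.vocab = {w for c in self.counts.values() for w in c}
        return self

    def predict(self, words):
        def log_prob(y):
            total = sum(self.counts[y].values()) + len(self.vocab)
            lp = math.log(self.prior[y])
            for w in words:
                lp += math.log((self.counts[y][w] + 1) / total)
            return lp
        return max(self.labels, key=log_prob)

def classify_all(idiom_words, contexts, related):
    """Stage 1 labels confident contexts; stage 2 is trained on them and decides the rest."""
    first_pass = [unsupervised_label(cohesion_score(idiom_words, c, related))
                  for c in contexts]
    train = [(c, y) for c, y in zip(contexts, first_pass) if y is not None]
    if not train:
        raise ValueError("stage 1 labelled nothing; no training data for stage 2")
    model = NaiveBayes().fit([c for c, _ in train], [y for _, y in train])
    return [y if y is not None else model.predict(c)
            for c, y in zip(contexts, first_pass)]

# Hypothetical data for the idiom "play with fire".
idiom = ["play", "fire"]
related = {("fire", "flames"), ("fire", "burn"), ("fire", "matches")}
contexts = [
    ["children", "matches", "flames", "burn"],   # high cohesion
    ["investors", "risky", "market", "warning"], # no cohesion
    ["risky", "market", "burn", "today"],        # undecided in stage 1
]
labels = classify_all(idiom, contexts, related)
print(labels)
```

In this sketch the third context falls between the two thresholds, so the first stage abstains and the second stage decides from the words it shares with the automatically labelled contexts. That is the sense in which the supervised classifier can draw on a larger feature space than cohesion alone.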
Related resources
Emotion Detection in Persian Text; A Machine Learning Model
This study aimed to develop a computational model for recognizing emotion in Persian text, framed as a supervised machine learning problem. We used Plutchik's emotion model as the supervised learning criterion and a Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...
Etymology, Contextual Pragmatic Clues, and Lexical Knowledge in L2 Idioms Learning
To investigate the effects of etymological elaboration, contextual pragmatic clues, and lexical knowledge on L2 idioms comprehension and production, 60 male intermediate level EFL students in three groups were selected. Each group was randomly assigned to one treatment condition. Group one participants were presented with the etymological explanation of idioms. In group two, the same idioms wer...
Computing Linear Discriminants for Idiomatic Sentence Detection
In this paper, we describe the binary classification of sentences into idiomatic and non-idiomatic. Our idiom detection algorithm is based on linear discriminant analysis (LDA). To obtain a discriminant subspace, we train our model on a small number of randomly selected idiomatic and non-idiomatic sentences. We then project both the training and the test data on the chosen subspace and use the ...
Automatic Idiom Identification in Wiktionary
Online resources, such as Wiktionary, provide an accurate but incomplete source of idiomatic phrases. In this paper, we study the problem of automatically identifying idiomatic dictionary entries with such resources. We train an idiom classifier on a newly gathered corpus of over 60,000 Wiktionary multi-word definitions, incorporating features that model whether phrase meanings are constructed ...
Automatic Sleep Stages Detection Based on EEG Signals Using Combination of Classifiers
Sleep stage classification is one of the most important methods for diagnosis in psychiatry and neurology. In this paper, a combination of three kinds of classifiers is proposed that classifies the EEG signal into five sleep stages: Awake, N-REM (non-rapid eye movement) stage 1, N-REM stage 2, N-REM stages 3 and 4 (also called Slow Wave Sleep), and REM. Twenty-five all night recordings...